Improving visual features for lip-reading

Authors

Yuxuan Lan

Barry-John Theobald

Richard Harvey

Eng-Jon Ong

Richard Bowden

Abstract

Automatic speech recognition systems that utilise the visual modality of speech are often investigated within a speaker-dependent or a multi-speaker paradigm. That is, during training the recogniser will have had prior exposure to example speech from each of the possible test speakers. In a previous paper we highlighted the danger of not using different speakers in the training and test sets, and demonstrated that, within a speaker-independent configuration, lip-reading performance degrades dramatically due to the speaker variability encoded in the visual features. In this paper, we examine feature improvement techniques to reduce speaker variability. We demonstrate that, by careful choice of technique, the effects of inter-speaker variability in the visual features can be reduced, which significantly improves the recognition accuracy of an automated lip-reading system. However, the performance of the lip-reading system is still significantly below that of acoustic speech recognition systems, and an analysis of the confusion matrices generated by the recogniser suggests this is largely due to the number of deletions apparent in a visual-only system.

Similar resources

Improving Lip-reading with Feature Spac Audio-Visual Speech R

In this paper we investigate feature space transforms to improve lip-reading performance for multi-stream HMM based audio-visual speech recognition (AVSR). The feature space transforms include a non-linear Gaussianization transform and feature space maximum likelihood linear regression (fMLLR). We apply Gaussianization at various stages of the visual front-end. The results show that Gaussianizing...

Designing and implementing a system for Automatic recognition of Persian letters by Lip-reading using image processing methods

For many years, speech has been the most natural and efficient means of information exchange for human beings. With the advancement of technology and the prevalence of computer usage, researchers have turned to the design and production of speech recognition systems. Among these, lip-reading techniques face many challenges for speech recognition; one of the challenges b...

Lip-reading: a new authentication method for Android mobile phone applications

Today, mobile phones are among the first instruments every person interacts with. People use many mobile applications to achieve their goals, and mobile banking is one of the most used. Security in m-bank applications is very important, so modern methods of authentication are required. Most m-bank applications use text passwords which can be stolen b...

An Efficient Lip-reading Method Using K-nearest Neighbor Algorithm

Many studies have been carried out on lip reading. Most of those works are based on color images, while some essential features, such as inner lip information, might not be obtained. In this paper, an RGBD camera is introduced to improve the recognition rate of lip reading. We try to complete lip reading using only gray-scale images. Thirteen groups of words are given, and we present e...

Decoding visemes: improving machine lipreading (PhD thesis)

This thesis is about improving machine lip-reading, that is, the classification of speech from only visual cues of a speaker. Machine lip-reading is a niche research problem in the areas of both speech processing and computer vision. Current challenges for machine lip-reading fall into two groups: the content of the video, such as the rate at which a person is speaking; or the parameters of the vid...

Publication date: 2010
